{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "8a4ce07d",
   "metadata": {},
   "source": [
    "# Tutorial: Data Science\n",
    "\n",
    "In this tutorial, we will introduce Solara from the perspective of a data scientist or when you are thinking of using Solara for a data science app.\n",
    "It is therefore focused on data (Pandas), visualizations (plotly) and how to add interactivity.\n",
    "\n",
    "## You should know\n",
    "This tutorial will assume:\n",
    "\n",
    "  * You have successfully installed Solara\n",
    "  * You know how to display a Solara component in a notebook or script\n",
    "\n",
    "If not, please follow the [Quick start](/documentation/getting_started).\n",
    "\n",
    "## Extra packages you need to install\n",
    "\n",
    "For this tutorial, you need plotly and pandas, you can install them using pip:\n",
    "\n",
    "```\n",
    "$ pip install plotly pandas\n",
    "```\n",
    "\n",
    "*Note: You might want to refresh your browser after installing plotly when using Jupyter.*\n",
    "\n",
    "## You will learn\n",
    "\n",
    "In this tutorial, you will learn:\n",
    "\n",
    "   * [To create a scatter plot using plotly.express](#our-first-scatter-plot)\n",
    "   * [Display your plot in a Solara component](#our-first-scatter-plot).\n",
    "   * [Build a UI to configure the X and Y axis](#configure-the-x-axis).\n",
    "   * [Handle a click event and record which point was clicked on](#interactive-plot).\n",
    "   * [Refactor your code to build a reusable Solara component](#make-a-reusable-component).\n",
    "   * [Compose your newly built component into a larger application](#make-a-reusable-component)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dfe143dc",
   "metadata": {},
   "source": [
    "## The dataset\n",
    "\n",
    "For this tutorial, we will use the [Iris flow data set](https://en.wikipedia.org/wiki/Iris_flower_data_set) which contains the lengths and widths of the petals and sepals of three species of Iris (setosa, virginica and versicolor).\n",
    "\n",
    "This dataset comes with many packages, but since we are doing to use plotly.express for this tutorial, we will use:\n",
    "\n",
    "```python\n",
    "import plotly.express as px\n",
    "df = px.data.iris()\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e9e78d49",
   "metadata": {},
   "outputs": [],
   "source": [
    "## solara: skip\n",
    "import plotly.express as px\n",
    "\n",
    "\n",
    "df = px.data.iris()\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0cccd2f7",
   "metadata": {},
   "source": [
    "## Our first scatter plot\n",
    "\n",
    "We use plotly express to create our scatter plot with just a single line.\n",
    "\n",
    "```python\n",
    "fig = px.scatter(df, \"sepal_length\", \"sepal_width\")\n",
    "```\n",
    "\n",
    "To display this figure in a Solara component, we should return an element that can render the plotly figure. [FigurePlotly](/documentation/components/viz/plotly) will do the job for us.\n",
    "\n",
    "Putting this together"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5d51ea0d",
   "metadata": {},
   "outputs": [],
   "source": [
    "import plotly.express as px\n",
    "import solara\n",
    "\n",
    "df = px.data.iris()\n",
    "\n",
    "\n",
    "@solara.component\n",
    "def Page():\n",
    "    fig = px.scatter(df, \"sepal_length\", \"sepal_width\")\n",
    "    solara.FigurePlotly(fig)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b3307cf5",
   "metadata": {},
   "outputs": [],
   "source": [
    "## solara: skip\n",
    "Page()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "704b6061",
   "metadata": {},
   "source": [
    "## Configuring the X-axis\n",
    "\n",
    "To configure the X-axis, first, create a global application state using:\n",
    "\n",
    "```python\n",
    "x_axis = solara.reactive(\"sepal_length\")\n",
    "```\n",
    "\n",
    "This code creates a reactive variable. You can use this reactive variable in your component and pass it to a [`Select`](/documentation/components/input/select) component to control the selected column.\n",
    "\n",
    "\n",
    "```python\n",
    "columns = list(df.columns)\n",
    "solara.Select(label=\"X-axis\", values=columns, value=x_axis)\n",
    "```\n",
    "\n",
    "Now, when the Select component's value changes, it will also update the reactive variable x_axis.\n",
    "\n",
    "If your components use the reactive value to create the plot, for example:\n",
    "\n",
    "\n",
    "```python\n",
    "fig = px.scatter(df, x_axis.value, \"sepal_width\")\n",
    "```\n",
    "\n",
    "The component will automatically re-execute the render function when the `x_axis` value changes, updating the figure accordingly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "860bb9cd",
   "metadata": {},
   "outputs": [],
   "source": [
    "columns = list(df.columns)\n",
    "x_axis = solara.reactive(\"sepal_length\")\n",
    "\n",
    "\n",
    "@solara.component\n",
    "def Page():\n",
    "    # Create a scatter plot by passing \"x_axis.value\" to px.scatter\n",
    "    # This will automatically make the component listen to changes in x_axis\n",
    "    # and re-execute this function when x_axis value changes\n",
    "    fig = px.scatter(df, x_axis.value, \"sepal_width\")\n",
    "    solara.FigurePlotly(fig)\n",
    "\n",
    "    # Pass x_axis to Select component\n",
    "    # The select will control the x_axis reactive variable\n",
    "    solara.Select(label=\"X-axis\", value=x_axis, values=columns)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0cb44e1d",
   "metadata": {},
   "outputs": [],
   "source": [
    "## solara: skip\n",
    "Page()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9c8f3895",
   "metadata": {},
   "source": [
    "### Understanding (optional)\n",
    "\n",
    "#### State\n",
    "\n",
    "Understanding state management and how Solara re-renders component is crucial for understanding building larger applications. If you don't fully graps it now, that is ok. You should first get used to the pattern, and consider reading [About state management](/documentation/getting_started/fundamentals/state-management) later on to get a deeper understanding.\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fa60244f",
   "metadata": {},
   "source": [
    "## Configure the Y-axis.\n",
    "\n",
    "Now that we can configure the X-axis, we can repeat the same for the Y-axis. Try to do this yourself, without looking at the code, as a good practice."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1157505b",
   "metadata": {},
   "outputs": [],
   "source": [
    "y_axis = solara.reactive(\"sepal_width\")\n",
    "\n",
    "\n",
    "@solara.component\n",
    "def Page():\n",
    "    fig = px.scatter(df, x_axis.value, y_axis.value)\n",
    "    solara.FigurePlotly(fig)\n",
    "    solara.Select(label=\"X-axis\", value=x_axis, values=columns)\n",
    "    solara.Select(label=\"Y-axis\", value=y_axis, values=columns)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "eb348680",
   "metadata": {},
   "outputs": [],
   "source": [
    "## solara: skip\n",
    "Page()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "13b71701",
   "metadata": {},
   "source": [
    "## Interactive plot\n",
    "\n",
    "We now built a small UI to control a scatter plot. However, often we also want to interact with the data, for instance select a point in our scatter plot.\n",
    "\n",
    "We could look up in the plotly documentation how exactly we can extract the right data, but lets take a different approach. We are simply going to store the data we get from `on_click` into a new reactive variable (`click_data`) and display the raw data into a Markdown component."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e74ce31e",
   "metadata": {},
   "outputs": [],
   "source": [
    "click_data = solara.reactive(None)\n",
    "\n",
    "\n",
    "@solara.component\n",
    "def Page():\n",
    "    fig = px.scatter(df, x_axis.value, y_axis.value)\n",
    "    solara.FigurePlotly(fig, on_click=click_data.set)\n",
    "    solara.Select(label=\"X-axis\", value=x_axis, values=columns)\n",
    "    solara.Select(label=\"Y-axis\", value=y_axis, values=columns)\n",
    "    # display it pre-formatted using the backticks `` using Markdown\n",
    "    solara.Markdown(f\"`{click_data}`\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "84497014",
   "metadata": {},
   "outputs": [],
   "source": [
    "## solara: skip\n",
    "Page()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "299460db",
   "metadata": {},
   "source": [
    "### Inspecting the on_click data\n",
    "\n",
    "Click a point and you should see the data printed out like:\n",
    "\n",
    "```python\n",
    "{'event_type': 'plotly_click', 'points': {'trace_indexes': [0], 'point_indexes': [32], 'xs': [5.2], 'ys': [4.1]}, 'device_state': {'alt': False, 'ctrl': False, 'meta': False, 'shift': False, 'button': 0, 'buttons': 1}, 'selector': None}\n",
    "```\n",
    "\n",
    "From this, we can get the row index, and the x and y coordinate.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "650752f6",
   "metadata": {},
   "outputs": [],
   "source": [
    "click_data = solara.reactive(None)\n",
    "\n",
    "\n",
    "@solara.component\n",
    "def Page():\n",
    "    fig = px.scatter(df, x_axis.value, y_axis.value)\n",
    "    solara.FigurePlotly(fig, on_click=click_data.set)\n",
    "    solara.Select(label=\"X-axis\", value=x_axis, values=columns)\n",
    "    solara.Select(label=\"Y-axis\", value=y_axis, values=columns)\n",
    "    # display it pre-formatted using the backticks `` using Markdown\n",
    "    if click_data.value:\n",
    "        row_index = click_data.value[\"points\"][\"point_indexes\"][0]\n",
    "        x = click_data.value[\"points\"][\"xs\"][0]\n",
    "        y = click_data.value[\"points\"][\"ys\"][0]\n",
    "        solara.Markdown(f\"`Click on index={row_index} x={x} y={y}`\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4a1db95d",
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "## solara: skip\n",
    "Page()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4762caa2",
   "metadata": {},
   "source": [
    "## Displaying the nearest neighbours\n",
    "\n",
    "We now have the point we clicked on, we will use that to improve our component, we will.\n",
    "\n",
    "   1. Add an indicator in the scatter plot to highlight which point we clicked on.\n",
    "   2. Find the nearest neighbours and display them in a table.\n",
    "  \n",
    "For the first item, we simply use plotly express again, and add the single trace it generated to the existing figure (instead of displaying two separate figures).\n",
    "\n",
    "We add a function to find the `n` nearest neighbours:\n",
    "\n",
    "```python\n",
    "def find_nearest_neighbours(df, xcol, ycol, x, y, n=10):\n",
    "    df = df.copy()\n",
    "    df[\"distance\"] = ((df[xcol] - x)**2 + (df[ycol] - y)**2)**0.5\n",
    "    return df.sort_values('distance')[1:n+1]\n",
    "```\n",
    "\n",
    "We now only find the nearest neighbours if `click_data.value` is not None, and display the dataframe using the [`DataFrame`](/documentation/components/data/dataframe) component.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6bfbb986",
   "metadata": {},
   "outputs": [],
   "source": [
    "click_data = solara.reactive(None)\n",
    "\n",
    "\n",
    "def find_nearest_neighbours(df, xcol, ycol, x, y, n=10):\n",
    "    df = df.copy()\n",
    "    df[\"distance\"] = ((df[xcol] - x) ** 2 + (df[ycol] - y) ** 2) ** 0.5\n",
    "    return df.sort_values(\"distance\")[1 : n + 1]\n",
    "\n",
    "\n",
    "@solara.component\n",
    "def Page():\n",
    "    fig = px.scatter(df, x_axis.value, y_axis.value, color=\"species\", custom_data=[df.index])\n",
    "\n",
    "    if click_data.value is not None:\n",
    "        x = click_data.value[\"points\"][\"xs\"][0]\n",
    "        y = click_data.value[\"points\"][\"ys\"][0]\n",
    "\n",
    "        # add an indicator\n",
    "        fig.add_trace(px.scatter(x=[x], y=[y], text=[\"⭐️\"]).data[0])\n",
    "        df_nearest = find_nearest_neighbours(df, x_axis.value, y_axis.value, x, y, n=3)\n",
    "    else:\n",
    "        df_nearest = None\n",
    "\n",
    "    solara.FigurePlotly(fig, on_click=click_data.set)\n",
    "    solara.Select(label=\"X-axis\", value=x_axis, values=columns)\n",
    "    solara.Select(label=\"Y-axis\", value=y_axis, values=columns)\n",
    "    if df_nearest is not None:\n",
    "        solara.Markdown(\"## Nearest 3 neighbours\")\n",
    "        solara.DataFrame(df_nearest)\n",
    "    else:\n",
    "        solara.Info(\"Click to select a point\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "334d22b1",
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "## solara: skip\n",
    "Page()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  },
  "vscode": {
   "interpreter": {
    "hash": "3f54047370d637df4a365f9bae65e296d7b1c0737aca7baed81d825616d991e7"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
